    • List of Articles Feature selection

      • Open Access Article

        1 - Ensemble Feature Selection Strategy Based on Hierarchical Clustering in Electronic Nose
        M. A. Bagheri Gh. A. Montazer
        Sensor-response redundancy remains a significant problem in electronic noses because the cross-selectivity of chemical gas sensors can degrade classification performance. In such situations, a more efficient multiple classifier system can be obtained in a random feature space than in the original one. Ensemble Feature Selection (EFS) methods assume that the overall feature set is redundant and that better performance can be achieved by choosing different subsets of input features for multiple classifiers. Combining these classifiers can yield a higher recognition rate. In this paper, we propose a feature subset selection method based on hierarchical clustering of transient features in order to enhance classifier diversity and the efficiency of the learning algorithms. Our algorithm is tested on UCI benchmark data sets and then used to design an odor recognition system. The experimental results show that the proposed method, which combines hierarchical-clustering-based feature subset selection with a multiple classifier system, achieves more efficient classification performance.
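The idea of building diverse feature subsets from a hierarchical clustering of features can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the distance `1 - |Pearson r|`, average linkage, and the rule of drawing one random feature per cluster are all choices made here for the example, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_features(columns, n_clusters):
    """Average-linkage agglomerative clustering of feature columns.

    Assumption: the distance between two features is 1 - |r|, so strongly
    correlated (redundant) features end up in the same cluster.
    """
    m = len(columns)
    dist = [[1.0 - abs(pearson(columns[i], columns[j])) for j in range(m)]
            for i in range(m)]
    clusters = [[i] for i in range(m)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters


def ensemble_subsets(clusters, n_subsets, seed=0):
    """Draw one feature index per cluster for each ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.choice(c) for c in clusters) for _ in range(n_subsets)]
```

Each returned subset would then be used to train one base classifier, and the classifier outputs combined, for instance by majority vote.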
      • Open Access Article

        2 - An Intelligent BGSA Based Method for Feature Selection in a Persian Handwritten Digits Recognition System
        N. Ghanbari S. M. Razavi S. H. Nabavi Karizi
        In this paper, an intelligent feature selection method for the recognition of Persian handwritten digits is presented. The fitness function, which is associated with the error of the Persian handwritten digit recognition system, is minimized by selecting appropriate features using the binary gravitational search algorithm (BGSA). Implementation results show that intelligent methods are well able to choose the most effective features for this recognition system. A comparison with similar methods based on the genetic algorithm and binary particle swarm optimization indicates the effective performance of the proposed method.
      • Open Access Article

        3 - A New Scheme for Automatic Classification of Power Quality Disturbances Based on Signal Processing and Machine Learning
        M. Hajian A. Akbari Forod
        Identification and classification of power quality disturbances (PQDs) is one of the most important functions in the monitoring and protection of modern power systems. A key issue in PQ analysis is the automatic diagnosis of waveforms using an effective algorithm. This paper presents an effective feature extraction method that integrates the discrete wavelet transform (DWT) and the hyperbolic S-transform (HST). Moreover, an efficient feature selection method, Orthogonal Forward Selection (OFS), which incorporates the Gram-Schmidt (GS) procedure into forward selection, is applied to select the best feature subset. Multi support vector machines (MSVM), a well-known classifier, are applied for classification, and the classifier's parameters are optimized using particle swarm optimization (PSO). Six single disturbances and two complex disturbances, as well as a pure sine wave (normal) selected as the reference, are considered for classification. The sensitivity of the proposed expert system under different noise conditions is investigated. The efficiency of the proposed method is also examined by comparing the results of this study with those of other papers.
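A greedy forward selection with Gram-Schmidt orthogonalization, in the spirit of the OFS step described above, might look like the sketch below. The abstract does not state the selection criterion, so this example assumes mean-centered feature columns and a numeric target, and scores each candidate by the squared cosine between its orthogonalized column and the target. The function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_forward_selection(columns, target, k, tol=1e-12):
    """Pick up to k feature indices by greedy orthogonal forward selection.

    Assumptions: columns and target are mean-centered; the score is the
    squared cosine between the orthogonalized candidate and the target.
    """
    residuals = [list(c) for c in columns]  # candidates, orthogonalized in place
    yy = dot(target, target)
    selected = []
    for _ in range(k):
        best_idx, best_score = None, 0.0
        for j, w in enumerate(residuals):
            if j in selected:
                continue
            ww = dot(w, w)
            if ww <= tol:  # nothing left after removing selected directions
                continue
            score = dot(w, target) ** 2 / (ww * yy)
            if score > best_score:
                best_idx, best_score = j, score
        if best_idx is None:
            break
        selected.append(best_idx)
        q = residuals[best_idx]
        qq = dot(q, q)
        # Gram-Schmidt step: remove the chosen direction from the rest
        for j, w in enumerate(residuals):
            if j not in selected:
                coef = dot(w, q) / qq
                residuals[j] = [a - coef * b for a, b in zip(w, q)]
    return selected
```

Because every remaining candidate is orthogonalized against the features already chosen, a column that merely duplicates a selected feature collapses to zero and is never picked.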
      • Open Access Article

        4 - Introducing a New Version of Binary Ant Colony Algorithm to Solve the Problem of Feature Selection
        S. Kashef H. Nezamabadi-pour
        Metaheuristic algorithms are a good choice for solving optimization problems. In this paper, a novel feature selection algorithm based on Ant Colony Optimization (ACO), called Advanced Binary ACO (ABACO), is presented. This algorithm is an advanced version of binary ant colony optimization (BACO) that attempts to solve the problems of the ACO and BACO algorithms by combining the two. The performance of the proposed algorithm is compared with that of the Binary Genetic Algorithm (BGA), Binary Particle Swarm Optimization (BPSO), and some prominent ACO-based algorithms on the task of feature selection on 12 well-known UCI datasets. Simulation results verify that the algorithm provides a suitable feature subset with good classification accuracy using a smaller feature set than competing feature selection methods.
      • Open Access Article

        5 - A Hybrid-Based Feature Selection Method for High-Dimensional Data Using Ensemble Methods
        A. Rouhi H. Nezamabadi-pour
        With the advent and proliferation of high-dimensional data, feature selection plays an important role in machine learning, and more specifically in classification. Dealing with high-dimensional data, e.g. microarrays, is associated with problems such as an increased presence of redundant and irrelevant features, which leads to decreased classification accuracy, increased computational cost, and the curse of dimensionality. In this paper, a hybrid feature selection method for high-dimensional data based on ensemble methods is proposed. In the first stage, a filter method reduces the dimensionality of the features; in the second stage, two state-of-the-art wrapper methods are run on the reduced feature subset using the ensemble technique. The proposed method is benchmarked on 8 microarray datasets. Comparisons with several state-of-the-art feature selection methods confirm the effectiveness of the proposed approach.
      • Open Access Article

        6 - Reduce Dimensions of CDF Steganalysis Approach Using a Graph Theory Based Feature Selection Method
        S. Azadifar S. H. Khasteh M. H. Edrisi
        The purpose of steganalysis is to prevent steganography methods from achieving their goals. In steganography, new ideas must be evaluated by subjecting them to known steganalysis attacks, and the results should be compared with those of other existing methods. One of the best-known steganalysis methods is the CDF method, which is used in this research. One of the major challenges in image steganalysis is the large number of extracted features. High-dimensional data sets reduce steganalysis performance in two ways. On the one hand, as the dimensionality of the data increases, the volume of computation increases; on the other hand, a model built on high-dimensional data generalizes poorly and is more likely to overfit. As a result, reducing the dimensionality of the problem can both reduce computational complexity and improve steganalysis performance. This paper combines the concept of the maximum weighted clique problem with an edge centrality measure, taking the suitability of each feature into account, to select the most effective features with minimum redundancy as the final features. Simulation results on the SPAM and CC-PEV data show that the proposed method performs well, achieving an accuracy of about 96% in detecting data embedding in images, and that it is more accurate than previously known methods.
      • Open Access Article

        7 - Attribute Reduction Based on Rough Set Theory by Soccer League Competition Algorithm
        M. Abdolrazzagh-Nezhad Ali Adibiyan
        The growing dimensionality of databases has made attribute reduction a critical issue in data mining; it seeks a subset of attributes that is most effective in revealing the hidden patterns. In recent years, rough set theory has been regarded by researchers as one of the most effective and efficient tools for this reduction. In this paper, the soccer league competition algorithm is modified and adapted to solve the attribute reduction problem for the first time. The motivations for choosing this algorithm are its ability to escape local optima, its ability to use the information distributed by players in the search space, its rapid convergence to optimal solutions, and its small number of parameters. The proposed modifications consist of using the total power of fixed and saved players in calculating the power of each team, combining continuous and discrete structures for each player, proposing a novel discretization method, providing a hydraulic analysis appropriate to the research problem for evaluating each player, and correcting the Imitation and Provocation operators based on the challenges in their original versions. The proposed ideas are evaluated on small, medium, and large data sets from UCI, and the experimental results are compared with state-of-the-art algorithms. The comparison shows the competitive advantages of the proposed algorithm over the investigated algorithms.
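In rough-set attribute reduction, candidate attribute subsets are usually scored by the dependency degree of the decision attribute on the subset, that is, the fraction of objects whose decision is fully determined by their values on the chosen attributes. The sketch below shows that standard quantity only; how the soccer league competition algorithm searches over subsets is not reproduced here, and the table layout (a list of dicts) and the function name are assumptions of this example.

```python
from collections import defaultdict


def dependency_degree(rows, attrs, decision):
    """Rough-set dependency degree of `decision` on the attribute list `attrs`.

    Objects with identical values on `attrs` form an equivalence class;
    a class belongs to the positive region if all of its objects share
    the same decision value.
    """
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        blocks[key].append(row[decision])
    consistent = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return consistent / len(rows)
```

A subset whose dependency degree equals that of the full attribute set, and from which no attribute can be dropped without lowering it, is a reduct.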
      • Open Access Article

        8 - Diagnosis of Attention-Deficit/Hyperactivity Disorder (ADHD) based on Variable Length Evolutionary Algorithm
        M. Ramzanyan Hussain Montazery Kordy
        Resting-state functional magnetic resonance imaging (rs-fMRI) is the imaging method used today to investigate brain connectivity for diagnosing brain-related diseases. In this paper, a new method is proposed that uses a variable-length evolutionary algorithm to select appropriate features in order to improve the accuracy of distinguishing healthy subjects from patients with attention-deficit/hyperactivity disorder based on the analysis of rs-fMRI images. The features examined are the correlation values between the time-series signals of different brain regions. Variable-length feature selection is based on the honey bee algorithm in order to overcome the limitations of feature selection in algorithms with fixed-length vectors. The Mahalanobis distance is used as the evaluation function of the bee algorithm. The efficiency of the algorithm was evaluated primarily in terms of the value of the evaluation function and secondarily in terms of processing time. The results show that the variable-length bee algorithm is significantly more efficient than other feature selection methods. While the best overall classification accuracy among the other methods is 76.61%, obtained by the PSO algorithm with 26 selected features, the proposed method achieves an overall classification accuracy of 85.32% with 25 selected features. The nature of the data is such that increasing the number of features further improves classification accuracy: increasing the length of the feature vector to 35 and 45 yields classification accuracies of 91.66% and 95.57%, respectively.
      • Open Access Article

        9 - A Feature Selection Algorithm in Online Stream Dataset Based on Multivariate Mutual Information
        Maryam Rahmaninia Parham Moradi
        Today, in many real-world applications, such as social networks, we are faced with data streams in which new data appear every moment. Since the efficiency of most data mining algorithms decreases as data dimensionality increases, analysis of such data has recently become one of the most important issues. Online stream feature selection is an effective approach that aims to remove redundant features and keep relevant ones, which reduces the size of the data and improves the accuracy of online data mining methods. There are several critical issues for online stream feature selection methods, including the unavailability of the entire feature set before the algorithm starts, scalability, stability, classification accuracy, and the size of the selected feature set. So far, existing methods have been able to address only a few of these issues simultaneously. To this end, in this paper we present an online feature selection method called MMIOSFS that provides a better tradeoff between these challenges using mutual information. In the proposed method, the feature set is first mapped to a new feature using the joint random variable technique; then the mutual information between the new feature and the class label is computed as the degree of relationship between the feature set and the class label. The proposed method was compared with several online feature selection algorithms from different categories. The results show that the proposed method usually achieves a better tradeoff between the mentioned challenges.
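The mapping of a feature set to a single joint variable, followed by a mutual information computation against the class label, can be illustrated for discrete features as below. This is a minimal sketch assuming small, already-discretized features and plug-in (frequency) probability estimates; it does not reproduce MMIOSFS itself, and both function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def joint_feature(columns):
    """Treat several feature columns as one joint random variable.

    Each sample's value of the new feature is the tuple of its values
    on the original features.
    """
    return list(zip(*columns))
```

A usage example with an XOR-style label shows why the joint view matters: each feature alone carries no information about the label, while the joint feature determines it completely.

```python
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [0, 1, 1, 0]
print(mutual_information(f1, label))                       # 0.0
print(mutual_information(joint_feature([f1, f2]), label))  # 1.0
```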
      • Open Access Article

        10 - Feature Selection and Cancer Classification Based on Microarray Data Using Multi-Objective Cuckoo Search Algorithm
        Kh. Kamari F. Rashidi A. Khalili
        Microarray datasets play an important role in the identification and classification of cancer tissues. In cancer research, the small number of available microarray samples is one of the main concerns, and it causes problems in designing classifiers. Moreover, due to the large number of features in microarrays, feature selection and classification are even more challenging for such datasets. Not all of these numerous features contribute to the classification task, and some even impede performance. Hence, an appropriate gene selection method can significantly improve the performance of cancer classification. In this paper, a modified multi-objective cuckoo search algorithm is used for feature selection and sample selection to find the best available solutions. To accelerate the optimization process and prevent trapping in local optima, new heuristic approaches are added to the original algorithm. The proposed algorithm is applied to six cancer datasets, and its results are compared with those of other existing methods. The results show that the proposed method has higher accuracy and validity than other existing approaches and is able to select a small subset of informative genes in order to increase classification accuracy.
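Any multi-objective search of this kind has to decide which candidate solutions are worth keeping, and the usual tool is Pareto dominance. The sketch below filters a list of candidates down to its non-dominated set. The two objectives used in the example, classification error and number of selected genes (both minimized), are an assumption made for illustration, since the abstract does not list the objectives, and the cuckoo search itself is not shown.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere
    and strictly better somewhere (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]
```

For example, with candidates given as `(error, n_genes)` pairs, `(0.12, 25)` is discarded because `(0.10, 20)` is better on both objectives, while `(0.15, 10)` survives because nothing else uses as few genes.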
      • Open Access Article

        11 - Efficient Recognition of Human Actions by Limiting the Search Space in Deep Learning Methods
        M. Koohzadi N. Moghadam
        The efficiency of human action recognition systems depends on extracting appropriate representations from video data. In recent years, deep learning methods have been proposed to extract efficient spatio-temporal representations. However, deep learning methods have high computational complexity when extended over the temporal domain. Challenges such as the sparsity and scarcity of discriminative data and high levels of noise increase the computational complexity of representing human actions. Therefore, creating a highly accurate representation incurs a very high computational cost. In this paper, spatial and temporal deep learning networks are enhanced by adding appropriate feature selection mechanisms to reduce the search space. In this regard, non-online and online feature selection mechanisms are studied to identify human actions with lower computational complexity and higher accuracy. The results show that the non-online feature selection mechanism leads to a significant reduction in computational complexity, and the online feature selection mechanism increases accuracy while controlling computational complexity.
      • Open Access Article

        12 - Introducing Intelligent Mutation Method Based on PSO Algorithm to Solve the Feature Selection Problem
        Mahmoud Parandeh Mina Zolfy Lighvan Jafar Tanha
        Today, with the increase in the volume of data produced, attention to machine learning algorithms for extracting knowledge from raw data has grown. Raw data usually contain redundant or irrelevant features that affect the performance of learning algorithms. Feature selection algorithms are used to improve the efficiency and reduce the computational cost of machine learning algorithms. A variety of feature selection methods have been proposed. Among them, evolutionary algorithms have attracted attention because of their global optimization power. Many evolutionary algorithms have been proposed to solve the feature selection problem, most of which have focused on the target space. The problem space can also provide vital information for solving the feature selection problem. Since evolutionary algorithms tend to become trapped in local optima, an effective mechanism for escaping them is necessary. This paper uses the PSO evolutionary algorithm with a multi-objective function. In the proposed algorithm, a new mutation method that uses the particle feature score is proposed, along with elitism, to escape local optima. The proposed algorithm is tested on different datasets and compared with existing algorithms. The simulation results show that the proposed method reduces the error by 20%, 11%, 85%, and 7% on the Isolet, Musk, Madelon, and Arrhythmia datasets, respectively, compared with the recent RFPSOFS method.
      • Open Access Article

        13 - Improving IoT Botnet Anomaly Detection Based on Dynamic Feature Selection and Hybrid Processing
        Boshra Pishgoo Ahmad Akbari Azirani
        The complexity of real-world applications, especially in the field of the Internet of Things, has brought with it a variety of security risks. IoT botnets are a type of complex security attack that can be detected using machine learning tools. Detecting these attacks requires, on the one hand, discovering their behavior patterns with high accuracy using batch processing and, on the other hand, operating in real time and adaptively, as in stream processing. This highlights the importance of hybrid batch/stream processing techniques for botnet detection. Important challenges of such processing include selecting appropriate features to build the base models and intelligently selecting the base models to combine for the final result. In this paper, we present a solution for botnet anomaly detection based on a combination of stream and batch learning methods. This approach uses a dynamic feature selection method that is based on a genetic algorithm and is fully compatible with the nature of hybrid processing. Experimental results on a data set containing two known types of botnets indicate that the proposed approach, on the one hand, increases the speed of hybrid processing and reduces botnet detection time by reducing the number of features and removing inappropriate ones and, on the other hand, increases accuracy by selecting appropriate models for combination.
      • Open Access Article

        14 - Multi-Label Feature Selection Using a Hybrid Approach Based on the Particle Swarm Optimization Algorithm
        Azar Rafiei Parham Moradi Abdolbaghi Ghaderzadeh
        Multi-label classification is one of the important problems in machine learning. The efficiency of multi-label classification algorithms decreases drastically as the problem dimensionality increases. Feature selection is one of the main approaches to dimensionality reduction in multi-label problems. Multi-label feature selection is an NP-hard problem, and so far a number of solutions based on swarm intelligence and evolutionary algorithms have been proposed for it. Increasing the dimensionality of the problem enlarges the search space and consequently reduces both the efficiency and the convergence speed of these algorithms. In this paper, a hybrid swarm intelligence solution based on a binary particle swarm optimization algorithm and a local search strategy is presented for multi-label feature selection. To increase the convergence speed, the local search strategy divides the features into two categories based on their degree of redundancy and their degree of relevance to the output of the problem. The first category consists of features that are highly similar to the problem class and less similar to other features; the second category consists of features that are similar to one another and less relevant. A local operator is therefore added to the particle swarm optimization algorithm to reduce the irrelevant and redundant features of each solution. Applying this operator increases the convergence speed of the proposed algorithm compared with other algorithms in this field. The performance of the proposed method has been compared with the best-known feature selection methods on different datasets. The experimental results show that the proposed method performs well in terms of accuracy.
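For orientation, one update step of a standard binary particle swarm optimizer, where each bit says whether a feature is selected, can be sketched as follows. This shows only the common sigmoid-transfer formulation of binary PSO; the local search operator described in the abstract is not included, and the parameter values (`w`, `c1`, `c2`, `v_max`) and the function name are illustrative assumptions.

```python
import math
import random


def bpso_step(position, velocity, personal_best, global_best, rng,
              w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO update for a single particle.

    position, personal_best, global_best: lists of 0/1 feature masks.
    velocity: list of floats, one per feature.
    rng: a random.Random instance (passed in for reproducibility).
    """
    new_pos, new_vel = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        # Usual PSO velocity update with inertia, cognitive and social terms
        v = w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        v = max(-v_max, min(v_max, v))        # clamp to keep probabilities sane
        prob = 1.0 / (1.0 + math.exp(-v))     # sigmoid transfer function
        new_vel.append(v)
        new_pos.append(1 if rng.random() < prob else 0)
    return new_pos, new_vel
```

In a full feature selection loop, each new mask would be scored by a multi-label fitness function, and the personal and global bests updated before the next step.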